Introducing Symmetries to Black Box Meta Reinforcement Learning

Abstract

Meta reinforcement learning (RL) attempts to discover new RL algorithms automatically from environment interaction. In so-called black-box approaches, the policy and the learning algorithm are jointly represented by a single neural network. These methods are very flexible, but they tend to underperform compared to human-engineered RL algorithms in terms of generalisation to new, unseen environments. In this paper, we explore the role of symmetries in meta-generalisation. We show that a recent successful meta RL approach that meta-learns an objective for backpropagation-based learning exhibits certain symmetries (specifically the reuse of the learning rule, and invariance to input and output permutations) that are not present in typical black-box meta RL systems. We hypothesise that these symmetries can play an important role in meta-generalisation. Building off recent work in black-box supervised meta learning, we develop a black-box meta RL system that exhibits these same symmetries. We show through careful experimentation that incorporating these symmetries can lead to algorithms with a greater ability to generalise to unseen action & observation spaces, tasks, ...
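
As a rough illustration of the two symmetries named in the abstract (reuse of one learning rule, and invariance to input and output permutations), the following sketch applies a single local update rule to every connection of a small weight matrix and checks that updating and permuting commute. It is not the system described in the paper: the rule `shared_rule`, its constants, and the helper names are hypothetical and chosen only to make the property easy to verify.

```python
import random


def shared_rule(w, pre, post):
    """Hypothetical local rule (an assumption, not the paper's rule).

    The same function is reused for every connection, so it carries no
    information about where in the network the weight sits.
    """
    return w + 0.1 * pre * post - 0.01 * w


def update(weights, inputs, outputs):
    """Apply shared_rule to each weight w[i][j] linking input i to output j."""
    return [[shared_rule(weights[i][j], inputs[i], outputs[j])
             for j in range(len(outputs))]
            for i in range(len(inputs))]


def permute(matrix, row_perm, col_perm):
    """Reorder rows (inputs) and columns (outputs) of a weight matrix."""
    return [[matrix[i][j] for j in col_perm] for i in row_perm]


def check_equivariance(seed=0):
    """Return True if 'update then permute' equals 'permute then update'."""
    rng = random.Random(seed)
    n_in, n_out = 4, 3
    weights = [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
    inputs = [rng.uniform(-1, 1) for _ in range(n_in)]
    outputs = [rng.uniform(-1, 1) for _ in range(n_out)]
    row_perm = rng.sample(range(n_in), n_in)
    col_perm = rng.sample(range(n_out), n_out)

    updated_then_permuted = permute(update(weights, inputs, outputs),
                                    row_perm, col_perm)
    permuted_then_updated = update(permute(weights, row_perm, col_perm),
                                   [inputs[i] for i in row_perm],
                                   [outputs[j] for j in col_perm])
    return updated_then_permuted == permuted_then_updated


if __name__ == "__main__":
    print(check_equivariance())
```

Because the rule depends only on the local quantities of each connection, relabelling the inputs and outputs simply relabels the resulting weights. A rule with separate parameters per connection would not, in general, have this property.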

Similar resources

Policy Improvement: Between Black-Box Optimization and Episodic Reinforcement Learning

Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. There are two main approaches to performing this optimization: reinforcement learning (RL) and black-box optimization (BBO). In recent years, benchmark comparisons between RL and BBO have been made, and there have been several attempts to specify which approach works best for which types o...

From Black-Box Learning Objects to Glass-Box Learning Objects

In the field of e-learning, a popular solution to make teaching material reusable is to represent it as a learning object (LO). However, building better adaptive educational software also takes an explicit model of the learner’s cognitive process related to LOs. This paper presents a three-layer model that explicitly connects the description of learners’ cognitive processes to LOs. The first laye...

Learning in a Black Box

Many interactive environments can be represented as games, but they are so large and complex that individual players are mostly in the dark about others’ actions and the payoff structure. This paper analyzes learning behavior in such ‘black box’ environments, where players’ only source of information is their own history of actions taken and payoffs received. The context of our analysis are dec...

Some Considerations on Learning to Explore via Meta-Reinforcement Learning

We consider the problem of exploration in meta reinforcement learning. Two new meta reinforcement learning algorithms are suggested: E-MAML and E-RL. Results are presented on a novel environment we call ‘Krazy World’ and a set of maze environments. We show E-MAML and E-RL deliver better performance on tasks where exploration is important.

Policy Improvement Methods: Between Black-Box Optimization and Episodic Reinforcement Learning

Policy improvement methods seek to optimize the parameters of a policy with respect to a utility function. There are two main approaches to performing this optimization: reinforcement learning (RL) and black-box optimization (BBO). Whereas BBO algorithms are generic optimization methods that, due to their generality, may also be applied to optimizing policy parameters, RL algorithms are specifi...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2022

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v36i7.20681